VitAL: Viterbi Algorithm for de novo Peptide Design
Authors
Abstract
BACKGROUND: Drug design against proteins to cure various diseases has been studied for several years. Numerous techniques have been developed for designing small organic molecules against specific protein targets, but the specificity, toxicity and selectivity of small molecules remain hard problems. Peptide drugs offer a partial solution to the toxicity problem. There has been wide interest in peptide design, yet techniques for designing a specific and selective peptide inhibitor against a protein target have not yet been established.

METHODOLOGY/PRINCIPAL FINDINGS: A novel de novo peptide design approach is developed to block the activities of disease-related protein targets. No prior training on known peptides is necessary. The method generates the peptide sequentially by docking its residues pair by pair along a chosen path on the protein. The binding site on the protein is determined via the coarse-grained Gaussian Network Model, and a binding path is determined. The best-fitting peptide is constructed by generating all possible peptide pairs at each point along the path and computing, with AutoDock, the binding energies between these pairs and the specific location on the protein. The Markov-based partition function for all possible choices of the peptides along the path is generated by a matrix multiplication scheme. The best-fitting peptide for the given surface is obtained from a Hidden Markov model using Viterbi decoding. The suitability of the conformations that the peptides adopt upon binding to the surface is included in the algorithm through the intrinsic Ramachandran potentials.

CONCLUSIONS/SIGNIFICANCE: The model is tested on known protein-peptide inhibitor complexes. The present algorithm predicts peptides with better binding energies than those of the existing ones. Finally, a heptapeptide is designed for a protein; according to AutoDock results, it has excellent binding affinity.
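The decoding step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a first-order chain in which each step along the path carries a table of pair energies, and it uses random placeholder numbers where the method would use AutoDock binding energies (and Ramachandran terms). The names `viterbi_min_energy`, `partition_function` and `pair_energy` are hypothetical. The first function finds the lowest-energy residue sequence by Viterbi-style dynamic programming; the second sums the Boltzmann weights of all sequences by repeated vector-matrix products, in the spirit of the matrix multiplication scheme mentioned above.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def viterbi_min_energy(pair_energy):
    """Return (path, energy) minimising the summed pair energies.

    pair_energy[t][i][j] is the (assumed) energy of residue i at path
    position t followed by residue j at position t + 1.
    """
    n = len(pair_energy[0])
    best = [0.0] * n  # best[i]: lowest energy of any prefix ending in residue i
    back = []         # back[t][j]: best predecessor of residue j at step t
    for step in pair_energy:
        new_best = [math.inf] * n
        pointers = [0] * n
        for j in range(n):
            for i in range(n):
                energy = best[i] + step[i][j]
                if energy < new_best[j]:
                    new_best[j] = energy
                    pointers[j] = i
        back.append(pointers)
        best = new_best
    last = min(range(n), key=best.__getitem__)
    path = [last]
    for pointers in reversed(back):  # trace the optimal choices backwards
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


def partition_function(pair_energy, kT=1.0):
    """Sum exp(-E/kT) over all sequences via transfer-matrix products."""
    n = len(pair_energy[0])
    weights = [1.0] * n
    for step in pair_energy:
        weights = [
            sum(weights[i] * math.exp(-step[i][j] / kT) for i in range(n))
            for j in range(n)
        ]
    return sum(weights)


# Placeholder energies (random, NOT docking results): a heptapeptide
# has 7 residues, hence 6 consecutive pair steps along the path.
rng = random.Random(0)
n_res = len(AMINO_ACIDS)
demo = [[[rng.uniform(-5.0, 0.0) for _ in range(n_res)]
         for _ in range(n_res)] for _ in range(6)]
demo_path, demo_energy = viterbi_min_energy(demo)
print("".join(AMINO_ACIDS[k] for k in demo_path), round(demo_energy, 2))
```

The dynamic program visits each pair table once, so its cost grows linearly with the peptide length instead of exponentially with the number of candidate sequences.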
Similar resources
Clustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...
Evolutionary algorithms and de novo peptide design
Automated de novo design of bioactive molecules is one of the aspired goals in computational chemistry. Despite significant progress in computational approaches for ligand design and efficient evaluation of binding energy, novel procedures for ligand design are required. Evolutionary computation provides a new approach to this design issue. This paper proposes a framework for evolving ligands...
Complexities and Algorithms for Glycan Structure Sequencing using Tandem Mass Spectrometry
Determining glycan structures is vital to comprehend cell-matrix, cell-cell, and even intracellular biological events. Glycan structure sequencing, which is to determine the primary structure of a glycan using MS/MS spectrometry, remains one of the most important tasks in proteomics. Analogous to the peptide de novo sequencing, the glycan de novo sequencing is to determine the structure without...
Multi-spectra peptide sequencing and its applications to multistage mass spectrometry
Despite a recent surge of interest in database-independent peptide identifications, accurate de novo peptide sequencing remains an elusive goal. While the recently introduced spectral network approach resulted in accurate peptide sequencing in low-complexity samples, its success depends on the chance of presence of spectra from overlapping peptides. On the other hand, while multistage mass spec...
Metagenome and Metatranscriptome Analyses Using Protein Family Profiles
Analyses of metagenome data (MG) and metatranscriptome data (MT) are often challenged by a paucity of complete reference genome sequences and the uneven/low sequencing depth of the constituent organisms in the microbial community, which respectively limit the power of reference-based alignment and de novo sequence assembly. These limitations make accurate protein family classification and abund...